An Algorithm for Voiced / Unvoiced Decision and Pitch Estimation in Speech Feature Extraction
Author
Abstract
An algorithm that combines voiced/unvoiced (U/V) decision and pitch estimation is proposed as part of an enhanced MFCC feature extraction process. The residual energy of LPC analysis and the normalized autocorrelation are computed, and static and dynamic thresholds are set so that each speech segment is classified as voiced, unvoiced, or transitional. The pitch is then estimated by a dynamic programming (DP) algorithm, and in the subsequent harmonic peak-picking step the estimate is refined using additional spectral information. The algorithm is strengthened by a finite state machine (FSM) embedded in the U/V decision, which converts the static thresholds into dynamically varying thresholds and so represents actual speech more accurately. Experiments show that the word recognition rate improves from 71.49% to 74.42% on the National 863 standard Mandarin speech corpus.
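The three-way classification described in the abstract can be illustrated with a minimal sketch. It uses only the normalized autocorrelation cue; the LPC residual energy, the FSM-driven dynamic thresholds, and the DP pitch tracker are not reproduced here. The function names and the threshold values `low=0.3` and `high=0.7` are hypothetical and are not taken from the paper.

```python
import math

def normalized_autocorrelation(frame, lag):
    """Normalized autocorrelation of one frame at a given lag, in [-1, 1]."""
    n = len(frame) - lag
    num = sum(frame[i] * frame[i + lag] for i in range(n))
    e0 = sum(frame[i] ** 2 for i in range(n))
    e1 = sum(frame[i + lag] ** 2 for i in range(n))
    denom = math.sqrt(e0 * e1)
    # A silent frame has zero energy; treat it as uncorrelated.
    return num / denom if denom > 0 else 0.0

def classify_frame(frame, min_lag, max_lag, low=0.3, high=0.7):
    """Label a frame from its autocorrelation peak.

    `low` and `high` are illustrative static thresholds (assumed values).
    Returns (label, best_lag), where best_lag is the lag of the peak.
    """
    best_lag = max(
        range(min_lag, max_lag + 1),
        key=lambda lag: normalized_autocorrelation(frame, lag),
    )
    peak = normalized_autocorrelation(frame, best_lag)
    if peak >= high:
        label = "voiced"
    elif peak <= low:
        label = "unvoiced"
    else:
        label = "transitional"
    return label, best_lag

# A periodic frame (period of 40 samples) versus a silent frame.
sine = [math.sin(2 * math.pi * i / 40) for i in range(240)]
print(classify_frame(sine, 20, 60))
print(classify_frame([0.0] * 240, 20, 60))
```

For a voiced frame, the peak lag serves as a raw pitch-period candidate: assuming an 8 kHz sampling rate, a lag of 40 samples corresponds to 200 Hz. In the approach the abstract describes, such candidates would then be tracked by DP and refined by harmonic peak picking.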
Similar resources
Improve the Implementation of Pitch Features for Mandarin Digit String Recognition Task
Mandarin digit string recognition (MDSR) is a difficult task in the field of automatic speech recognition (ASR), and using pitch features can significantly improve performance. In conventional methods of pitch feature extraction, a random value is commonly used as the pitch output in unvoiced (UV) frames, which causes serious statistical confusion between voiced (V) and UV units and incurs abnormal...
Your Wavelet Based Pitch Detection and Voiced/unvoiced Decision
This paper describes the sudden change that a speech signal undergoes at its Glottal Closure Instant (GCI) and, on that basis, discusses the principle of wavelet localization in both the time and frequency domains. Based on this discussion, an algorithm for voiced/unvoiced segment decision and pitch detection is presented.
A novel approach of low bit-rate speech coding based on sinusoidal representation and auditory model
In this paper, a new auditory-spectrum-based speech feature is proposed using a sinusoidal representation and an auditory model. The feature is optimized using the properties of auditory perception and masking. After quantizing and encoding the optimized feature parameters, a new speech-coding algorithm with an average bit rate of 3.25 kbps is developed. The experimental results show that the synthetic ...
An algorithm for multi-pitch tracking in co-channel speech
Most multi-pitch algorithms are tested for performance only in voiced regions of speech, and are prone to yield pitch estimates even when the participating speakers are unvoiced. This paper presents a multi-pitch algorithm that detects the voiced and unvoiced regions in a mixture of two speakers, identifies the number of speakers in voiced regions, and yields the pitch estimates of each speaker...
Decomposition of Speech into Voiced and Unvoiced Components Based on a Kalman Filterbank
We present a novel method for decomposing speech into signals representing the voiced and unvoiced components of speech. The method involves first demodulating the variations in spectral envelope, energy and pitch, and then applying a bank of Kalman filters to separate the harmonic and non-harmonic components of the signal. The use of Kalman filters relies on a state-space representation of the...
Journal title:
Volume, issue
Pages -
Publication date: 2002